Abstract

We consider the two-variable fragment $\mathrm{FO}^2[<]$ of first-order logic
over finite words. Numerous characterizations of this class are known.
Thérien and Wilke have shown that it is decidable whether a given regular
language is definable in $\mathrm{FO}^2[<]$. From a practical point of view, as
shown by Weis, $\mathrm{FO}^2[<]$ is interesting since its satisfiability problem
is in NP. Restricting the number of quantifier alternations yields an
infinite hierarchy inside the class of $\mathrm{FO}^2[<]$-definable languages. We
show that each level of this hierarchy is decidable. For this purpose,
we relate each level of the hierarchy with a decidable variety of finite
monoids.

Our result implies that there are many different ways of climbing
up the $\mathrm{FO}^2[<]$-quantifier alternation hierarchy: deterministic and
co-deterministic products, Mal'cev products with definite and reverse
definite semigroups, iterated block products with $\mathcal{J}$-trivial monoids,
and some inductively defined omega-term identities. A combinatorial
tool in the process of ascension is that of condensed rankers, a refinement
of the rankers of Weis and Immerman and the turtle programs
of Schwentick, Thérien, and Vollmer.
1 Introduction

The investigation of logical fragments has a long history. McNaughton and 
Papert [16] showed that a language over finite words is definable in
first-order logic FO[<] if and only if it is star-free. Combined with
Schützenberger's characterization of star-free languages in terms of finite
aperiodic monoids [22], this leads to an algorithm to decide whether a given regular
language is first-order definable. Many other characterizations of this class 
have been given over the past 50 years, see [3] for an overview. Moreover, 
mainly due to its relation to linear temporal logic [7], it became relevant to 
a large number of application fields, such as verification. 

Very often one is interested in fragments of first-order logic. From a 
practical point of view, the reason is that smaller fragments often yield 
more efficient algorithms for computational problems such as satisfiability. 
For example, satisfiability for FO[<] is non-elementary [25], whereas the 
satisfiability problem for first-order logic with only two variables is in NP, 
cf. [38]. And on the theoretical side, fragments form the basis of a descrip- 
tive complexity theory inside the regular languages: the simpler a logical 
formula defining a language, the easier the language. Moreover, in contrast 
to classical complexity theory, in some cases one can actually decide whether 
a given language has a particular property. From both the practical and the 
theoretical point of view, several natural hierarchies have been considered 
in the literature: the quantifier alternation hierarchy inside FO[<] which 
coincides with the Straubing-Thérien hierarchy [26, 31], the quantifier
alternation hierarchy inside FO[<,+1] with a successor predicate +1 which
coincides with the dot-depth hierarchy [2, 35], the until hierarchy of
temporal logic [33], and the until-since hierarchy [34]. Decidability is known for
the levels of the until and the until-since hierarchies, and only for the
very first levels of the alternation hierarchies, see e.g. [H 20] .

Fragments are usually defined by restricting resources in a formula. Such 
resources can be the predicates which are allowed, the quantifier depth, the 
number of quantifier alternations, or the number of variables. When the 
quantifier depth is restricted, only finitely many languages are definable over 
a fixed alphabet: decidability of the membership problem is not an issue in 
this case. When restricting the number of variables which can be used (and
reused), then first-order logic $\mathrm{FO}^3[<]$ with three variables already has the
full expressive power of FO[<], see 0E]. On the other hand, first-order logic
$\mathrm{FO}^2[<]$ with only two variables defines a proper subclass. The languages
definable in $\mathrm{FO}^2[<]$ have a huge number of different characterizations, see
e.g. [U [291 [30]. For example, $\mathrm{FO}^2[<]$ has the same expressive power as
$\Delta_2[<]$; the latter is a fragment of FO[<] with two blocks of quantifiers.

Turtle programs are one of these numerous descriptions of
$\mathrm{FO}^2[<]$-definable languages [23]. They are sequences of instructions of the
form "go to the next a-position" and "go to the previous a-position". Using the term
ranker for this concept and having a stronger focus on the order of positions
defined by such sequences, Weis and Immerman [39] were able to give a
combinatorial characterization of the alternation hierarchy $\mathrm{FO}^2_m[<]$ inside
$\mathrm{FO}^2[<]$. Straubing [27] gave an algebraic characterization of $\mathrm{FO}^2_m[<]$. But
neither result yields the decidability of $\mathrm{FO}^2_m[<]$-definability for m > 2. In
some sense, this is the opposite of a previous result of the authors [14, Thm.
6.1], who give necessary and sufficient conditions which helped to decide the
$\mathrm{FO}^2_m[<]$-hierarchy with an error of at most one. In this paper we give a new
algebraic characterization of $\mathrm{FO}^2_m[<]$, and this characterization immediately
yields decidability.

The algebraic approach to the membership problem of logical fragments 
has several advantages. In favorable cases, it opens the road to decidability 
procedures. Moreover, it allows a more semantic comparison of fragments; 
for example, the equality $\mathrm{FO}^2[<] = \Delta_2[<]$ was obtained by showing that
both $\mathrm{FO}^2[<]$ and $\Delta_2[<]$ correspond to the same variety of finite monoids,
namely $\mathbf{DA}$ [211 152] .

Building on previous detailed knowledge of the lattice of band varieties 
(varieties of idempotent monoids), Trotter and Weil defined a sub-lattice of 
the lattice of subvarieties of $\mathbf{DA}$ [36], which we call the
$\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy. These varieties have many interesting properties and
in particular, each $\mathbf{R}_m$ (resp. $\mathbf{L}_m$) is efficiently decidable (by a
combination of results of Trotter and Weil [36], Kufleitner and Weil [10],
and Straubing and Weil [28], see Section 3 for more details). Moreover, one
can climb up the $\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy algebraically, using Mal'cev
products, see [10] and Section 2 below; language-theoretically, in terms of
alternated closures under deterministic and co-deterministic products
[18, 14]; and combinatorially using condensed rankers, see [13, 15] and
Section 2.

We relate the $\mathrm{FO}^2[<]$ quantifier alternation hierarchy with the
$\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy. More precisely, the main result of this paper is
that a language is definable in $\mathrm{FO}^2_m[<]$ if and only if it is recognized by
a monoid in $\mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$, thus establishing the decidability of
each $\mathrm{FO}^2_m[<]$. This result was first conjectured in [13], where one
inclusion was established. Our proof combines a technique introduced by
Klíma [8] and a substitution idea [11] with algebraic and combinatorial
tools inspired by [14]. The proof is by induction and the base case is
Simon's Theorem on piecewise testable languages [24].




Figure 1: The positions defined by $r$ in $u$, when
$r = X_{a_1} X_{a_2} X_{a_3} X_{a_4} Y_{a_5} Y_{a_6} X_{a_7}$ is condensed on $u$

2 Preliminaries 

Let A be a finite alphabet and let A* be the set of all finite words over A. 
The length $|u|$ of a word $u = a_1 \cdots a_n$, $a_i \in A$, is $n$ and its alphabet is
$\mathrm{alph}(u) = \{a_1, \ldots, a_n\} \subseteq A$. A position $i$ of $u = a_1 \cdots a_n$ is an
$a$-position if $a_i = a$. A factorization $u = u_- a u_+$ is the $a$-left
factorization of $u$ if $a \notin \mathrm{alph}(u_-)$, and it is the $a$-right factorization
if $a \notin \mathrm{alph}(u_+)$, i.e., we factor at the first or at the last $a$-position.
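
The two factorizations can be illustrated by a minimal Python sketch. The function names `alph`, `a_left` and `a_right` are hypothetical, and words are assumed to be Python strings whose letters are single characters.

```python
def alph(u):
    """Alphabet of the word u: the set of letters occurring in u."""
    return set(u)

def a_left(u, a):
    """a-left factorization u = u_minus + a + u_plus with a not in alph(u_minus).

    Returns (u_minus, u_plus), or None if the letter a does not occur in u.
    """
    i = u.find(a)                      # first a-position (0-based index)
    return None if i < 0 else (u[:i], u[i + 1:])

def a_right(u, a):
    """a-right factorization u = u_minus + a + u_plus with a not in alph(u_plus)."""
    i = u.rfind(a)                     # last a-position (0-based index)
    return None if i < 0 else (u[:i], u[i + 1:])
```

For instance, `a_left("bcabca", "a")` splits at the first `a`, whereas `a_right("bcabca", "a")` splits at the last one.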

2.1 Rankers 

A ranker is a nonempty word over the alphabet $\{X_a, Y_a \mid a \in A\}$. It is
interpreted as a sequence of instructions of the form "go to the next a-position"
and "go to the previous a-position". More formally, for $u = a_1 \cdots a_n \in A^*$
and $x \in \{0, \ldots, n+1\}$ we let

$X_a(u,x) = \min\{y \mid y > x \text{ and } a_y = a\}$, $X_a(u) = X_a(u,0)$,

$Y_a(u,x) = \max\{y \mid y < x \text{ and } a_y = a\}$, $Y_a(u) = Y_a(u,n+1)$.

Here, both the minimum and the maximum of the empty set are undefined.
The modality $X_a$ is for "neXt-a" and $Y_a$ is for "Yesterday-a". For $r = Zs$,
$Z \in \{X_a, Y_a \mid a \in A\}$, we set

$r(u,x) = s(u, Z(u,x))$, $r(u) = s(u, Z(u))$.

In particular, rankers are executed (as a set of instructions) from left to 
right. Every ranker r either defines a unique position in a word u, or it 
is undefined on $u$. For example, $X_a Y_b X_c(bca) = 2$ and $X_a Y_b X_c(bac) = 3$
whereas $X_a Y_b X_c(cabc)$ and $X_a Y_b X_c(bcba)$ are undefined. A ranker $r$ is
condensed on $u$ if it is defined and, during the execution of $r$, no previously
visited position is overrun [14]. One can think of condensed rankers as
zooming in on the position they define, see Figure 1. More formally,
$r = Z_1 \cdots Z_k$, $Z_i \in \{X_a, Y_a \mid a \in A\}$, is condensed on $u$ if there exists a
chain of open intervals

$(0; |u|+1) = (x_0; y_0) \supset (x_1; y_1) \supset \cdots \supset (x_{k-1}; y_{k-1}) \ni r(u)$

such that for all $1 \le \ell \le k-1$ the following properties are satisfied:

• If $Z_\ell Z_{\ell+1} = X_a X_b$, then $(x_\ell; y_\ell) = (X_a(u, x_{\ell-1}); y_{\ell-1})$.

• If $Z_\ell Z_{\ell+1} = Y_a Y_b$, then $(x_\ell; y_\ell) = (x_{\ell-1}; Y_a(u, y_{\ell-1}))$.

• If $Z_\ell Z_{\ell+1} = X_a Y_b$, then $(x_\ell; y_\ell) = (x_{\ell-1}; X_a(u, x_{\ell-1}))$.

• If $Z_\ell Z_{\ell+1} = Y_a X_b$, then $(x_\ell; y_\ell) = (Y_a(u, y_{\ell-1}); y_{\ell-1})$.

For example, $X_a Y_b X_c$ is condensed on $bca$ but not on $bac$.
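
The definitions of $r(u)$ and of condensed rankers can be made concrete with a small Python sketch. The encoding is an assumption made for illustration only: a ranker is a list of pairs such as `("X", "a")` for $X_a$, positions are 1-based as in the text, and the names `step`, `run` and `is_condensed` are hypothetical. The interval update uses the observation that, in all four cases above, the new endpoint is computed from the left end for an X-modality and from the right end for a Y-modality, and it replaces the left end exactly when the next modality is an X-modality.

```python
def step(u, z, x):
    """Apply one modality z = (direction, letter) from position x; None if undefined."""
    d, a = z
    ys = range(x + 1, len(u) + 1) if d == "X" else range(x - 1, 0, -1)
    return next((y for y in ys if u[y - 1] == a), None)

def run(u, r):
    """The position r(u), or None if the ranker r is undefined on u."""
    x = 0 if r[0][0] == "X" else len(u) + 1
    for z in r:
        x = step(u, z, x)
        if x is None:
            return None
    return x

def is_condensed(u, r):
    """Check the chain-of-intervals condition for r on u."""
    pos = run(u, r)
    if pos is None:
        return False
    lo, hi = 0, len(u) + 1                 # the open interval (x_0; y_0)
    for z, z_next in zip(r, r[1:]):
        p = step(u, z, lo if z[0] == "X" else hi)
        if p is None or not lo < p < hi:
            return False                   # the next interval would not be nested
        if z_next[0] == "X":
            lo = p                         # cases X_a X_b and Y_a X_b
        else:
            hi = p                         # cases X_a Y_b and Y_a Y_b
    return lo < pos < hi

r = [("X", "a"), ("Y", "b"), ("X", "c")]   # the ranker X_a Y_b X_c
```

With this encoding, `run("bca", r)` and `is_condensed("bca", r)` reproduce the examples given in the text.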

The depth of a ranker is its length as a word. A block of a ranker is a 
maximal factor of the form $X_{a_1} \cdots X_{a_k}$ or of the form $Y_{b_1} \cdots Y_{b_k}$. A ranker
with $m$ blocks changes direction $m - 1$ times. By $R_{m,n}$ we denote the class
of all rankers with depth at most $n$ and with up to $m$ blocks. We write $R^X_{m,n}$
for the set of all rankers in $R_{m,n}$ which start with an $X_a$-modality and we
write $R^Y_{m,n}$ for all rankers in $R_{m,n}$ which start with a $Y_a$-modality.
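
As a quick illustration of depth, blocks and the classes $R_{m,n}$, $R^X_{m,n}$, $R^Y_{m,n}$, here is a short Python sketch. It assumes rankers are encoded as lists of pairs such as `("X", "a")`; the function names are hypothetical.

```python
def depth(r):
    """Depth of a ranker: its length as a word."""
    return len(r)

def blocks(r):
    """Number of blocks: one more than the number of direction changes."""
    return 1 + sum(1 for z, w in zip(r, r[1:]) if z[0] != w[0])

def in_class(r, m, n, first=None):
    """Membership in R_{m,n}; pass first="X" or first="Y" for R^X_{m,n} or R^Y_{m,n}."""
    return depth(r) <= n and blocks(r) <= m and (first is None or r[0][0] == first)

# The ranker of Figure 1, with letters a1, ..., a7 written as strings.
fig1 = [("X", "a1"), ("X", "a2"), ("X", "a3"), ("X", "a4"),
        ("Y", "a5"), ("Y", "a6"), ("X", "a7")]
```

The ranker `fig1` has depth 7 and three blocks, so it changes direction twice.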

We define $u \triangleright_{m,n} v$ if the same rankers in $R^X_{m,n} \cup R^Y_{m-1,n-1}$ are condensed
on $u$ and $v$. Similarly, $u \triangleleft_{m,n} v$ if the same rankers in $R^Y_{m,n} \cup R^X_{m-1,n-1}$
are condensed on $u$ and $v$. The relations $\triangleright_{m,n}$ and $\triangleleft_{m,n}$ are finite index
congruences [14, Lem. 3.13].

The order type ord(i,j) is one of {<, =, >}, depending on whether i < j, 
$i = j$, or $i > j$, respectively. We define $u \equiv_{m,n} v$ if

• the same rankers in $R_{m,n}$ are defined on $u$ and $v$,

• for all $r \in R^X_{m,n}$ and $s \in R^Y_{m,n-1}$: $\mathrm{ord}(r(u), s(u)) = \mathrm{ord}(r(v), s(v))$,

• for all $r \in R^Y_{m,n}$ and $s \in R^X_{m,n-1}$: $\mathrm{ord}(r(u), s(u)) = \mathrm{ord}(r(v), s(v))$,

• for all $r \in R^X_{m,n}$ and $s \in R^X_{m-1,n-1}$: $\mathrm{ord}(r(u), s(u)) = \mathrm{ord}(r(v), s(v))$,

• for all $r \in R^Y_{m,n}$ and $s \in R^Y_{m-1,n-1}$: $\mathrm{ord}(r(u), s(u)) = \mathrm{ord}(r(v), s(v))$.
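
For very small parameters, the relation $\equiv_{m,n}$ can be tested by brute force. The following Python sketch enumerates all rankers of the relevant classes over a given alphabet. It is only an illustration under stated assumptions: rankers are tuples of pairs such as `("X", "a")`, order types are compared only for rankers that are defined (definedness is checked first and then agrees on both words), and all function names are hypothetical. The enumeration is exponential in $n$ and is not meant as an efficient procedure.

```python
from itertools import product

def run(u, r):
    """The position r(u) in 1..len(u), or None if r is undefined on u."""
    x = 0 if r[0][0] == "X" else len(u) + 1
    for d, a in r:
        ys = range(x + 1, len(u) + 1) if d == "X" else range(x - 1, 0, -1)
        x = next((y for y in ys if u[y - 1] == a), None)
        if x is None:
            return None
    return x

def blocks(r):
    return 1 + sum(1 for z, w in zip(r, r[1:]) if z[0] != w[0])

def rankers(alphabet, m, n, first=None):
    """All rankers of depth <= n with <= m blocks, optionally with a fixed first direction."""
    mods = [(d, a) for d in "XY" for a in alphabet]
    return [r for k in range(1, n + 1) for r in product(mods, repeat=k)
            if blocks(r) <= m and (first is None or r[0][0] == first)]

def order(i, j):
    """Order type of (i, j) encoded as -1, 0 or 1."""
    return (i > j) - (i < j)

def equiv(u, v, m, n, alphabet):
    """Brute-force test of u ≡_{m,n} v following the five conditions above."""
    if any((run(u, r) is None) != (run(v, r) is None)
           for r in rankers(alphabet, m, n)):
        return False
    pairs = [("X", "Y", m, n - 1), ("Y", "X", m, n - 1),
             ("X", "X", m - 1, n - 1), ("Y", "Y", m - 1, n - 1)]
    for dr, ds, ms, ns in pairs:
        for r in rankers(alphabet, m, n, dr):
            for s in rankers(alphabet, ms, ns, ds):
                pu, qu = run(u, r), run(u, s)
                if pu is None or qu is None:
                    continue               # undefined on u, hence also on v
                if order(pu, qu) != order(run(v, r), run(v, s)):
                    return False
    return True
```

For example, `equiv("ab", "ba", 1, 2, "ab")` fails because $X_a X_b$ is defined on $ab$ but not on $ba$.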

Remark 1. For $m = 1$, each of the families $(\equiv_{1,n})_n$, $(\triangleright_{1,n})_n$, and $(\triangleleft_{1,n})_n$
defines the class of piecewise testable languages, see e.g. [5, 21]. Recall that
a language $L \subseteq A^*$ is piecewise testable if it is a Boolean combination of
languages of the form $A^* a_1 A^* \cdots a_k A^*$ ($k \ge 0$, $a_1, \ldots, a_k \in A$).
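
Membership in a language of the form $A^* a_1 A^* \cdots a_k A^*$ amounts to containing $a_1 \cdots a_k$ as a scattered subword. A minimal Python sketch follows; the names `has_subword` and `example_language` are hypothetical, and the example language is chosen only to illustrate a Boolean combination.

```python
def has_subword(w, p):
    """True if w lies in A* p[0] A* p[1] ... p[k-1] A*, i.e. p is a scattered subword of w."""
    it = iter(w)
    # 'c in it' consumes the iterator up to the first occurrence of c,
    # so successive letters of p are matched from left to right.
    return all(c in it for c in p)

def example_language(w):
    """A piecewise testable language: words containing 'ab' but not 'ba' as a subword."""
    return has_subword(w, "ab") and not has_subword(w, "ba")
```

Over the alphabet $\{a, b\}$, `example_language` accepts exactly the words of the form $a^i b^j$ with $i, j \ge 1$.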

2.2 First-order Logic 

We denote by FO[<] the first-order logic over words interpreted as labeled 
linear orders. The atomic formulas are $\top$ (for true), $\bot$ (for false), the unary
predicates $a(x)$ (one for each $a \in A$), and the binary predicate $x < y$ for
variables $x$ and $y$. Variables range over the linearly ordered positions of a
word and $a(x)$ means that $x$ is an $a$-position. Apart from the Boolean
connectives, we allow composition of formulas using existential quantification
$\exists x\colon \varphi$ and universal quantification $\forall x\colon \varphi$ for $\varphi \in \mathrm{FO}[<]$. The semantics
is as usual. A sentence in FO[<] is a formula without free variables. For
a sentence $\varphi$ the language defined by $\varphi$, denoted by $L(\varphi)$, is the set of all
words $u \in A^*$ which model $\varphi$.

The fragment $\mathrm{FO}^2[<]$ of first-order logic consists of all formulas which use
at most two different names for the variables. This is a natural restriction,
since FO with three variables already has the full expressive power of FO.
A formula $\varphi \in \mathrm{FO}^2[<]$ is in $\mathrm{FO}^2_m[<]$ if, on every path of its parse tree, $\varphi$ has
at most $m$ blocks of alternating quantifiers.

Note that $\mathrm{FO}^2_1[<]$-definable languages are exactly the piecewise testable
languages, cf. [27]. For $m \ge 2$, we rely on the following important result,
due to Weis and Immerman [39, Thm. 4.5].

Theorem 2. A language $L$ is definable in $\mathrm{FO}^2_m[<]$ if and only if there exists
$n \in \mathbb{N}$ such that $L$ is a union of $\equiv_{m,n}$-classes.

Remark 3. The definition of $\equiv_{m,n}$ above is formally different from the
conditions in Weis and Immerman's [39, Thm. 4.5]. A careful but elementary
examination reveals that they are actually equivalent.

2.3 Algebra 

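A monoid $M$ recognizes a language $L \subseteq A^*$ if there exists a morphism
$\varphi\colon A^* \to M$ such that $L = \varphi^{-1}\varphi(L)$. If $\varphi\colon A^* \to M$ is a morphism,
then we set $u \equiv_\varphi v$ if $\varphi(u) = \varphi(v)$. The join $\equiv_1 \vee \equiv_2$ of two congruences
$\equiv_1$ and $\equiv_2$ is the least congruence containing $\equiv_1$ and $\equiv_2$. An element $u$ is
idempotent if $u^2 = u$. The set of all idempotents of a monoid $M$ is denoted
by $E(M)$. For every finite monoid $M$ there exists $\omega \in \mathbb{N}$ such that $u^\omega$ is
idempotent for all $u \in M$. Green's relations $\mathcal{J}$, $\mathcal{R}$, and $\mathcal{L}$ are an important
concept to describe the structural properties of a monoid $M$: we set $u \le_{\mathcal{J}} v$
(resp. $u \le_{\mathcal{R}} v$, $u \le_{\mathcal{L}} v$) if $u = pvq$ (resp. $u = vq$, $u = pv$) for some $p, q \in M$.
We also define $u \mathrel{\mathcal{J}} v$ (resp. $u \mathrel{\mathcal{R}} v$, $u \mathrel{\mathcal{L}} v$) if $u \le_{\mathcal{J}} v$ and $v \le_{\mathcal{J}} u$ (resp.
$u \le_{\mathcal{R}} v$ and $v \le_{\mathcal{R}} u$, $u \le_{\mathcal{L}} v$ and $v \le_{\mathcal{L}} u$). We write $u <_{\mathcal{J}} v$ if $u \le_{\mathcal{J}} v$
but not $v \le_{\mathcal{J}} u$. A monoid $M$ is $\mathcal{J}$-trivial (resp. $\mathcal{R}$-trivial, $\mathcal{L}$-trivial) if
$\mathcal{J}$ (resp. $\mathcal{R}$, $\mathcal{L}$) is the identity relation on $M$. We
define the relations $\sim_K$, $\sim_D$, and $\sim_{LI}$ on $M$ as follows:

• $u \sim_K v$ if and only if, for all $e \in E(M)$, we have either $eu, ev <_{\mathcal{J}} e$, or
$eu = ev$.

• $u \sim_D v$ if and only if, for all $f \in E(M)$, we have either $uf, vf <_{\mathcal{J}} f$,
or $uf = vf$.

• $u \sim_{LI} v$ if and only if, for all $e, f \in E(M)$ such that $e \mathrel{\mathcal{J}} f$, we have
either $euf, evf <_{\mathcal{J}} e$, or $euf = evf$.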

The relations $\sim_K$, $\sim_D$ and $\sim_{LI}$ are congruences [9]. If $\mathbf{V}$ is a class of finite
monoids, we say that a monoid $M$ is in $\mathbf{K}$ ⓜ $\mathbf{V}$ (resp. $\mathbf{D}$ ⓜ $\mathbf{V}$, $\mathbf{LI}$ ⓜ $\mathbf{V}$) if
$M/{\sim_K} \in \mathbf{V}$ (resp. $M/{\sim_D} \in \mathbf{V}$, $M/{\sim_{LI}} \in \mathbf{V}$). The classes $\mathbf{K}$ ⓜ $\mathbf{V}$, $\mathbf{D}$ ⓜ $\mathbf{V}$
and $\mathbf{LI}$ ⓜ $\mathbf{V}$ are called Mal'cev products and they are usually defined in
terms of relational morphisms. In the present context however, the definition
above will be sufficient [9], see [5].

Figure 2: The $\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy (with $\mathbf{R}_1 = \mathbf{L}_1 = \mathbf{J} = \mathbf{R}_2 \cap \mathbf{L}_2$,
$\mathbf{R}_2 = \mathbf{R}$ and $\mathbf{L}_2 = \mathbf{L}$)

We will need the following classes of finite monoids:

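• $\mathbf{J}_1$ consists of all finite commutative monoids satisfying $x^2 = x$.

• $\mathbf{J}$ (resp. $\mathbf{R}$, $\mathbf{L}$) consists of all finite $\mathcal{J}$-trivial (resp. $\mathcal{R}$-trivial,
$\mathcal{L}$-trivial) monoids.

• $\mathbf{A}$ consists of all finite monoids satisfying $x^{\omega+1} = x^\omega$. Monoids in $\mathbf{A}$
are called aperiodic.

• $\mathbf{DA}$ consists of all finite monoids satisfying $(xy)^\omega x (xy)^\omega = (xy)^\omega$.

• $\mathbf{R}_1 = \mathbf{L}_1 = \mathbf{J}$, $\mathbf{R}_{m+1} = \mathbf{K}$ ⓜ $\mathbf{L}_m$, $\mathbf{L}_{m+1} = \mathbf{D}$ ⓜ $\mathbf{R}_m$.

It is well known that

$\mathbf{DA} = \mathbf{LI}$ ⓜ $\mathbf{J}_1$, $\mathbf{R}_2 = \mathbf{R}$, $\mathbf{L}_2 = \mathbf{L}$, $\mathbf{R} \cap \mathbf{L} = \mathbf{J}$, and

$\mathbf{R}_m \cup \mathbf{L}_m \subseteq \mathbf{R}_{m+1} \cap \mathbf{L}_{m+1} \subseteq \mathbf{DA} \subseteq \mathbf{A}$,

see e.g. [19]. The $\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy is depicted in Figure 2.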
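
The defining identities of $\mathbf{A}$ and $\mathbf{DA}$ can be checked directly on a multiplication table. The following Python sketch assumes that a finite monoid is given as a list of lists `mul` with elements `0, ..., n-1` and `mul[x][y]` the product $xy$; the function names and the two sample monoids are illustrative choices, not taken from the text.

```python
def omega(mul, x):
    """The idempotent power x^omega: the unique idempotent among the powers of x."""
    p = x
    while mul[p][p] != p:
        p = mul[p][x]
    return p

def is_aperiodic(mul):
    """Check the identity x^(omega+1) = x^omega."""
    return all(mul[omega(mul, x)][x] == omega(mul, x) for x in range(len(mul)))

def is_da(mul):
    """Check the identity (xy)^omega x (xy)^omega = (xy)^omega."""
    n = len(mul)
    for x in range(n):
        for y in range(n):
            e = omega(mul, mul[x][y])
            if mul[mul[e][x]][e] != e:
                return False
    return True

U1 = [[0, 1], [1, 1]]   # {1, 0} under multiplication: element 0 is 1, element 1 is zero
Z2 = [[0, 1], [1, 0]]   # the cyclic group of order 2
```

Here `U1` satisfies both identities, whereas the group `Z2` satisfies neither.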

2.4 The variety approach to the decidability of $\mathrm{FO}^2_m[<]$

Classes of finite monoids that are closed under taking submonoids,
homomorphic images and finite direct products are called pseudovarieties. The
classes of finite monoids $\mathbf{J}_1$, $\mathbf{J}$, $\mathbf{A}$, $\mathbf{DA}$, $\mathbf{R}_m$ and $\mathbf{L}_m$ introduced above are
all pseudovarieties.

If $\mathbf{V}$ is a pseudovariety of monoids, the class $\mathcal{V}$ of languages recognized by
a monoid in $\mathbf{V}$ is called a variety of languages. Eilenberg's variety theorem
(see e.g. [17, Annex B]) shows that varieties of languages are characterized
by natural closure properties, and that the correspondence $\mathbf{V} \mapsto \mathcal{V}$ is onto.
Elementary automata theory shows in addition that a language $L$ is
recognized by a monoid in a pseudovariety $\mathbf{V}$ if and only if the syntactic monoid of
$L$ is in $\mathbf{V}$. It follows that if $\mathbf{V}$ has a decidable membership problem, then
so does the corresponding variety of languages $\mathcal{V}$.

Simon's Theorem on piecewise testable languages [El [24] is an important
instance of this Eilenberg correspondence: a language $L$ is recognizable by a
monoid in $\mathbf{J}$ if and only if $L$ is piecewise testable (and hence, as we already
observed, if and only if $L$ is definable in $\mathrm{FO}^2_1[<]$). Simon's result implies the
decidability of piecewise testability.

It immediately follows from the definition that membership in $\mathbf{R}_m$ and
$\mathbf{L}_m$ is decidable for all $m$ since membership in $\mathbf{J}$ is decidable (see
Corollary 10 for a more precise statement). Many additional properties of the
pseudovarieties $\mathbf{R}_m$ and $\mathbf{L}_m$, and of the corresponding varieties of languages
were established by the authors [10, 14, 36]. We will use in particular the
following results, respectively [14, Cor. 3.15] and [10, Thms. 2.1 and 3.5].

Proposition 4. An $A$-generated monoid $M$ is in $\mathbf{R}_m$ (resp. $\mathbf{L}_m$) if and
only if there exists an integer $n$ such that $M$ is a quotient of $A^*/{\triangleright_{m,n}}$ (resp.
$A^*/{\triangleleft_{m,n}}$).

Let $x_1, x_2, \ldots$ be a sequence of variables. For each word $u$, we denote by
$\bar{u}$ the mirror image of $u$, that is, the word obtained by reading $u$ from right
to left. Let $G_2 = x_2 x_1$, $I_2 = x_2 x_1 x_2$ and, for $m \ge 2$, $G_{m+1} = x_{m+1} \bar{G}_m$ and
$I_{m+1} = G_{m+1} x_{m+1} \bar{I}_m$. Finally, let $\varphi$ be the substitution given by

$\varphi(x_1) = (x_1^\omega x_2^\omega x_1^\omega)^\omega$, $\varphi(x_2) = x_2^\omega$,

and, for $m \ge 2$, $\varphi(x_{m+1}) = (x_{m+1}^\omega \varphi(G_m \bar{G}_m)^\omega x_{m+1}^\omega)^\omega$.

Proposition 5. $\mathbf{R}_m$ (resp. $\mathbf{L}_m$) is the class of finite monoids satisfying
$(xy)^\omega x (xy)^\omega = (xy)^\omega$ and $\varphi(G_m) = \varphi(I_m)$ (resp. $\varphi(\bar{G}_m) = \varphi(\bar{I}_m)$).

Straubing [27] and Kufleitner and Lauser [12, Cor. 3.4] established, by
different means, that for each $m \ge 1$, the class of $\mathrm{FO}^2_m[<]$-definable
languages forms a variety of languages, and we denote by $\mathbf{FO}^2_m$ the
corresponding pseudovariety. In particular, $\mathbf{FO}^2_1 = \mathbf{J}$. Our strategy to establish the
decidability of $\mathrm{FO}^2_m[<]$-definability is to establish the decidability of
membership in $\mathbf{FO}^2_m$.

It is to be noted that neither Straubing's result, nor Kufleitner's and
Lauser's result implies the decidability of $\mathbf{FO}^2_m$. Straubing's result is the
following [27, Thm. 4].

Theorem 6. For $m \ge 1$, $\mathbf{FO}^2_{m+1} = \mathbf{FO}^2_m$ ** $\mathbf{J}$, where ** denotes the
two-sided wreath product.

We refer the reader to [27] for the definition of the two-sided wreath
product, which is also called the block product in the literature. As discussed
by Straubing, this exact algebraic characterization of $\mathbf{FO}^2_m$ implies the
decidability of $\mathbf{FO}^2_2$ but not of the other levels of the hierarchy. Straubing
however conjectured that the following holds [27, Conj. 10].

Conjecture 7 (Straubing). Let $u_1 = (x_1 x_2)^\omega$, $v_1 = (x_2 x_1)^\omega$ and, for $m \ge 1$,

$u_{m+1} = (x_1 \cdots x_{2m} x_{2m+1})^\omega \, u_m \, (x_{2m+2} x_1 \cdots x_{2m})^\omega$,
$v_{m+1} = (x_1 \cdots x_{2m} x_{2m+1})^\omega \, v_m \, (x_{2m+2} x_1 \cdots x_{2m})^\omega$.

Then a monoid is in $\mathbf{FO}^2_m$ if and only if it satisfies $x^{\omega+1} = x^\omega$ and $u_m = v_m$.

If established, this conjecture would prove the decidability of each $\mathbf{FO}^2_m$.
The authors on the other hand proved the following [14, Thm. 5.1].

Theorem 8. If a language $L$ is recognized by a monoid in the join $\mathbf{R}_m \vee \mathbf{L}_m$,
then $L$ is definable in $\mathrm{FO}^2_m[<]$; and if $L$ is definable in $\mathrm{FO}^2_m[<]$, then $L$ is
recognized by a monoid in $\mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$.

3 The $\mathrm{FO}^2$ alternation hierarchy is decidable

We tighten the connection between the alternation hierarchy within $\mathrm{FO}^2[<]$
and the $\mathbf{R}_m$-$\mathbf{L}_m$-hierarchy and we prove the following result.

Theorem 9. A language $L \subseteq A^*$ is definable in $\mathrm{FO}^2_m[<]$ if and only if it is
recognizable by a monoid in $\mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$.

Theorem 9 immediately yields a decidability result.

Corollary 10. For each $m \ge 1$, it is decidable whether a given regular
language $L$ is $\mathrm{FO}^2_m[<]$-definable. This decision can be achieved in Logspace on
input the multiplication table of the syntactic monoid of $L$, and in Pspace
on input its minimal automaton.

Moreover, given an $\mathrm{FO}^2[<]$-definable language $L$, one can compute the
least integer $m$ such that $L$ is $\mathrm{FO}^2_m[<]$-definable.



Proof. We already observed that the $\mathbf{R}_m$ and $\mathbf{L}_m$ are decidable, and that
each is described by two omega-term identities (Proposition 5). The
decidability statement follows immediately. The complexity statement is a
consequence of Straubing and Weil's [28, Thm. 2.19]. The computability
statement follows immediately. □
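
To make the decidability statement concrete, here is a naive Python sketch that tests membership in $\mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$ directly from the Mal'cev product definitions of Section 2.3, by repeatedly forming the quotients $M/{\sim_K}$ and $M/{\sim_D}$. This is not the identity-based procedure behind the complexity bounds of Corollary 10, only an illustration. It assumes that the input `mul` is the multiplication table of a finite monoid (elements `0, ..., n-1`, with `mul[x][y]` the product $xy$), intended to be the syntactic monoid of the language, and it relies on the fact quoted above that $\sim_K$ and $\sim_D$ are congruences. All function names are hypothetical.

```python
def ideals(mul):
    """ideals[v] is the two-sided ideal M v M, so u <=_J v iff u in ideals[v]."""
    n = len(mul)
    return [{mul[mul[p][v]][q] for p in range(n) for q in range(n)} for v in range(n)]

def is_j_trivial(mul):
    idl, n = ideals(mul), len(mul)
    return all(u == v or not (u in idl[v] and v in idl[u])
               for u in range(n) for v in range(n))

def quotient(mul, side):
    """Multiplication table of M/~_K (side="K") or M/~_D (side="D")."""
    idl, n = ideals(mul), len(mul)
    idem = [e for e in range(n) if mul[e][e] == e]

    def below(x, e):                       # strict J-order: x <_J e
        return x in idl[e] and e not in idl[x]

    def related(u, v):
        for e in idem:
            a, b = (mul[e][u], mul[e][v]) if side == "K" else (mul[u][e], mul[v][e])
            if a != b and not (below(a, e) and below(b, e)):
                return False
        return True

    # Representative of each class: its smallest element (uses that ~ is an equivalence).
    rep = [min(v for v in range(n) if related(u, v)) for u in range(n)]
    reps = sorted(set(rep))
    index = {r: i for i, r in enumerate(reps)}
    return [[index[rep[mul[r][s]]] for s in reps] for r in reps]

def in_R(mul, m):
    """R_1 = J and R_{m+1} = K (m) L_m."""
    return is_j_trivial(mul) if m == 1 else in_L(quotient(mul, "K"), m - 1)

def in_L(mul, m):
    """L_1 = J and L_{m+1} = D (m) R_m."""
    return is_j_trivial(mul) if m == 1 else in_R(quotient(mul, "D"), m - 1)

def fo2_level(mul, m):
    """Test of Theorem 9: membership in the intersection of R_{m+1} and L_{m+1}."""
    return in_R(mul, m + 1) and in_L(mul, m + 1)

# Sample input: a two-element left-zero semigroup {a, b} with an identity adjoined
# (element 0 is the identity, and xy = x for x, y in {a, b}).
LZ = [[0, 1, 2], [1, 1, 1], [2, 2, 2]]
```

On the sample monoid `LZ`, which is $\mathcal{R}$-trivial but not $\mathcal{L}$-trivial, `fo2_level(LZ, 1)` is false and `fo2_level(LZ, 2)` is true. Iterating over $m = 1, 2, \ldots$ would give the least level for a monoid in $\mathbf{DA}$.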

We now turn to the proof of Theorem 9. One implication was established
in Theorem 8. To prove the reverse implication, we prove Proposition 11
below, which establishes that every language recognized by a monoid
$M \in \mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$ is a union of $\equiv_{m,n}$-classes for some integer $n$ depending on
$M$. Theorem 9 follows, in view of Theorem 2.

Proposition 11. For every $m \ge 1$ and every morphism $\varphi\colon A^* \to M$ with
$M \in \mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$ there exists an integer $n$ such that $\equiv_{m,n}$ is contained
in $\equiv_\varphi$.

Before we embark on the proof of Proposition 11 we record several
algebraic and combinatorial lemmas.

3.1 A collection of technical lemmas 

Lemma 12. Let $M$ be a finite monoid. If $s \mathrel{\mathcal{R}} sx$ and $x \sim_K y$, then $sx = sy$.
If $s \mathrel{\mathcal{L}} xs$ and $x \sim_D y$, then $xs = ys$.

Proof. Let $z \in M$ such that $sxz = s$. We have $(xz)^\omega x \mathrel{\mathcal{J}} (xz)^\omega$. Now,
$x \sim_K y$ implies $(xz)^\omega x = (xz)^\omega y$. Thus $sx = s(xz)^\omega x = s(xz)^\omega y = sy$. The
second statement is left-right symmetric. □

The following lemma illustrates an important structural property of 
monoids in DA. 

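Lemma 13. Let $\varphi\colon A^* \to M$, with $M \in \mathbf{DA}$ and let $x, y, z \in A^*$ such that
$\varphi(x) \mathrel{\mathcal{R}} \varphi(xy)$ and $\mathrm{alph}(z) \subseteq \mathrm{alph}(y)$. Then $\varphi(x) \mathrel{\mathcal{R}} \varphi(xz)$.

Proof. The map $\mathrm{alph}\colon A^* \to \mathcal{P}(A)$ can be seen as a morphism, where the
product on $\mathcal{P}(A)$ is the union operation. Since $M \in \mathbf{DA}$, we have
$M/{\sim_{LI}} \in \mathbf{J}_1$; let $\pi\colon M \to M/{\sim_{LI}}$ be the projection morphism. It is easily verified
that there exists a morphism $\psi\colon \mathcal{P}(A) \to M/{\sim_{LI}}$ such that $\psi \circ \mathrm{alph} = \pi \circ \varphi$,
see Figure 3.

Figure 3: $M \in \mathbf{DA} = \mathbf{LI}$ ⓜ $\mathbf{J}_1$ (commutative square with $\varphi\colon A^* \to M$,
$\mathrm{alph}\colon A^* \to \mathcal{P}(A)$, $\pi\colon M \to M/{\sim_{LI}}$ and $\psi\colon \mathcal{P}(A) \to M/{\sim_{LI}}$)

By assumption, $\varphi(x) = \varphi(xyt)$ for some $t \in A^*$, and hence
$\varphi(x) = \varphi(x)\varphi(yt)^\omega$. Since $\mathrm{alph}((yt)^\omega) = \mathrm{alph}((yt)^\omega z (yt)^\omega)$, we have
$\varphi(yt)^\omega \sim_{LI} \varphi(yt)^\omega \varphi(z) \varphi(yt)^\omega$. Applying the definition of $\sim_{LI}$ with
$e = f = \varphi(yt)^\omega$, it follows that $\varphi(yt)^\omega = \varphi(yt)^\omega \varphi(z) \varphi(yt)^\omega$ and we now have

$\varphi(x) = \varphi(x)\varphi(yt)^\omega = \varphi(x)\varphi(yt)^\omega \varphi(z) \varphi(yt)^\omega = \varphi(x)\varphi(z)\varphi(yt)^\omega$.

Therefore $\varphi(x) \mathrel{\mathcal{R}} \varphi(xz)$, which concludes the proof. □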

A proof of the following lemma can be found in [14, Prop. 3.6 and
Lem. 3.7].

Lemma 14. Let $m \ge 2$, $u, v \in A^*$, $a \in A$.

1. If $u \triangleright_{m,n} v$ and $u = u_- a u_+$ and $v = v_- a v_+$ are $a$-left factorizations,
then $u_- \triangleright_{m,n-1} v_-$ and $u_+ \triangleright_{m,n-1} v_+$.

2. If $u \triangleright_{m,n} v$ and $u = u_- a u_+$ and $v = v_- a v_+$ are $a$-right factorizations,
then $u_- \triangleright_{m,n-1} v_-$ and $u_+ \triangleleft_{m-1,n-1} v_+$.

Dual statements hold for $u \triangleleft_{m,n} v$.

Lemma 15. Let $m, n \ge 2$ and let $u = u_- a u_+$ and $v = v_- a v_+$ be $a$-left
factorizations. If $u \equiv_{m,n} v$, then $u_- \equiv_{m-1,n-1} v_-$ and $u_+ \equiv_{m,n-1} v_+$. A
dual statement holds for the factors of the $a$-right factorizations of $u$ and $v$.

Proof. We first show $u_- \equiv_{m-1,n-1} v_-$. Consider a ranker $r \in R_{m-1,n-1}$,
supposing first that $r \in R^X_{m-1,n-1}$. Then $r$ is defined on $u_-$ if and only if $r$
is defined on $u$ and $\mathrm{ord}(r'(u), X_a(u))$ is $<$ for every nonempty prefix $r'$ of $r$.
By definition of $\equiv_{m,n}$, this is equivalent to $r$ being defined on $v_-$. If instead
$r \in R^Y_{m-1,n-1}$, then $r$ is defined on $u_-$ if and only if $X_a r \in R_{m,n}$ is defined
on $u$ and $\mathrm{ord}(X_a r'(u), X_a(u))$ is $<$ for every nonempty prefix $r'$ of $r$. Again,
this is equivalent to $r$ being defined on $v_-$ since $u \equiv_{m,n} v$. Thus, the same
rankers in $R_{m-1,n-1}$ are defined on $u_-$ and $v_-$.

Now consider rankers $r \in R^X_{m-1,n-1}$ and $s \in R^Y_{m-1,n-2}$, which we can
assume to be defined on both $u_-$ and $v_-$. Then the order types induced by $r$
and $s$ on $u_-$ and $v_-$ are equal, since $\mathrm{ord}(r(u_-), s(u_-)) = \mathrm{ord}(r(u), X_a s(u)) =
\mathrm{ord}(r(v), X_a s(v)) = \mathrm{ord}(r(v_-), s(v_-))$ and $X_a s \in R^X_{m,n-1}$.

The same reasoning applies if $r \in R^Y_{m-1,n-1}$ and $s \in R^X_{m-1,n-2}$ (resp. if
$r \in R^X_{m-1,n-1}$ and $s \in R^X_{m-2,n-2}$, if $r \in R^Y_{m-1,n-1}$ and $s \in R^Y_{m-2,n-2}$) since
in that case, $\mathrm{ord}(r(u_-), s(u_-)) = \mathrm{ord}(X_a r(u), s(u))$ (resp. $\mathrm{ord}(r(u), s(u))$,
$\mathrm{ord}(X_a r(u), X_a s(u))$). Therefore, $u_- \equiv_{m-1,n-1} v_-$.

We now verify that $u_+ \equiv_{m,n-1} v_+$. The proof is very similar to the first
part and deviates only in technical details. Consider a ranker $r \in R_{m,n-1}$,
say, in $R^X_{m,n-1}$. Then $r$ is defined on $u_+$ if and only if $X_a r \in R_{m,n}$ is
defined on $u$ and $\mathrm{ord}(X_a r'(u), X_a(u))$ is $>$ for every nonempty prefix $r'$ of $r$.
Again, this is equivalent to $r$ being defined on $v_+$ since $u \equiv_{m,n} v$. If instead
$r \in R^Y_{m,n-1}$, then $r$ is defined on $u_+$ if and only if $r$ is defined on $u$ and
$\mathrm{ord}(r'(u), X_a(u))$ is $>$ for every nonempty prefix $r'$ of $r$, which is equivalent
to $r$ being defined on $v_+$. Thus, the same rankers in $R_{m,n-1}$ are defined on
$u_+$ and $v_+$.

Now consider rankers $r \in R^X_{m,n-1}$ and $s \in R^Y_{m,n-2}$, both defined on $u_+$
and $v_+$. Then the order types induced by $r$ and $s$ on $u_+$ and $v_+$ are equal,
since $\mathrm{ord}(r(u_+), s(u_+)) = \mathrm{ord}(X_a r(u), s(u))$ and $X_a r \in R^X_{m,n}$.

Again, a similar verification guarantees that the order types induced by
$r$ and $s$ on $u_+$ and $v_+$ are equal also if $r \in R^Y_{m,n-1}$ and $s \in R^X_{m,n-2}$, or if
$r \in R^X_{m,n-1}$ and $s \in R^X_{m-1,n-2}$, or if $r \in R^Y_{m,n-1}$ and $s \in R^Y_{m-1,n-2}$. This
shows $u_+ \equiv_{m,n-1} v_+$ which completes the proof. □

Lemma 16. Let $m, n \ge 2$ and let $u = u_- a u_0 b u_+$ and $v = v_- a v_0 b v_+$ describe
$b$-left and $a$-right factorizations (that is, $a \notin \mathrm{alph}(u_0 b u_+) \cup \mathrm{alph}(v_0 b v_+)$ and
$b \notin \mathrm{alph}(u_- a u_0) \cup \mathrm{alph}(v_- a v_0)$). If $u \equiv_{m,n} v$, then $u_0 \equiv_{m-1,n-1} v_0$.

Proof. A ranker $r \in R^X_{m-1,n-1}$ is defined on $u_0$ if and only if $Y_a r \in R_{m,n}$ is
defined on $u$ and $\mathrm{ord}(Y_a r'(u), Y_a(u))$ is $>$ and $\mathrm{ord}(Y_a r'(u), X_b(u))$ is $<$ for
every nonempty prefix $r'$ of $r$. Similarly, a ranker $r \in R^Y_{m-1,n-1}$ is defined
on $u_0$ if and only if $X_b r \in R_{m,n}$ is defined on $u$ and $\mathrm{ord}(X_b r'(u), Y_a(u))$
is $>$ and $\mathrm{ord}(X_b r'(u), X_b(u))$ is $<$ for every nonempty prefix $r'$ of $r$. Thus,
if $u \equiv_{m,n} v$, then the same rankers in $R_{m-1,n-1}$ are defined on $u_0$ and $v_0$.

Now consider rankers $r \in R^X_{m-1,n-1}$ and $s \in R^Y_{m-1,n-2}$ (resp.
$r \in R^Y_{m-1,n-1}$ and $s \in R^X_{m-1,n-2}$), defined on both $u_0$ and $v_0$. Then
$\mathrm{ord}(r(u_0), s(u_0)) = \mathrm{ord}(Y_a r(u), X_b s(u))$ (resp. $\mathrm{ord}(X_b r(u), Y_a s(u))$). Since
$u \equiv_{m,n} v$, $Y_a r \in R^Y_{m,n}$ and $X_b s \in R^X_{m,n-1}$ (resp. $X_b r \in R^X_{m,n}$ and
$Y_a s \in R^Y_{m,n-1}$), the order types defined by $r$ and $s$ on $u_0$ and $v_0$ are equal.

If $m = 2$, we are done proving that $u_0 \equiv_{m-1,n-1} v_0$. We now assume
that $m \ge 3$. Let $r \in R^X_{m-1,n-1}$ and $s \in R^X_{m-2,n-2}$ (resp. $r \in R^Y_{m-1,n-1}$
and $s \in R^Y_{m-2,n-2}$) be defined on both $u_0$ and $v_0$. Then $\mathrm{ord}(r(u_0), s(u_0)) =
\mathrm{ord}(Y_a r(u), Y_a s(u))$ (resp. $\mathrm{ord}(X_b r(u), X_b s(u))$). By the same reasoning as
above, the order type defined by $r$ and $s$ on $u_0$ and $v_0$ is the same since
$Y_a r \in R^Y_{m,n}$ and $Y_a s \in R^Y_{m-1,n-1}$ (resp. $X_b r \in R^X_{m,n}$ and $X_b s \in R^X_{m-1,n-1}$). This
concludes the proof of the lemma. □
Figure 4: A commutative diagram



3.2 Proof of Proposition 11

The proof is by induction on $m$. We already observed that $L$ is
$\mathrm{FO}^2_1[<]$-definable if and only if it is piecewise testable, if and only if it is accepted
by a monoid in $\mathbf{J}$. Since $\mathbf{J} = \mathbf{R}_2 \cap \mathbf{L}_2$, Proposition 11 holds for $m = 1$. We
now assume that $m \ge 2$.

Let $\varphi\colon A^* \to M$ be a morphism with $M \in \mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$. We note
that it suffices to prove Proposition 11 for the morphism $\varphi'\colon A^* \to M \times 2^A$
given by $\varphi'(u) = (\varphi(u), \mathrm{alph}(u))$. Observe that, for $u, v \in A^*$,

$\varphi'(u) \sim_D \varphi'(v)$ (resp. $\varphi'(u) \sim_K \varphi'(v)$) implies $\mathrm{alph}(u) = \mathrm{alph}(v)$. (1)

Indeed we have $\varphi'(u)\varphi'(u)^\omega = \varphi'(u)^\omega$ (since $M$ is aperiodic): then
$\varphi'(u) \sim_D \varphi'(v)$ implies that $\varphi'(v)\varphi'(u)^\omega = \varphi'(u)\varphi'(u)^\omega$ and by definition
of $\varphi'$, $\mathrm{alph}(v)$ is contained in $\mathrm{alph}(u)$. By symmetry, $u$ and $v$ have the same
alphabetical content and the same holds for $\sim_K$.

To lighten up the notation, we dispense with the consideration of $\varphi'$ and
we assume that $\varphi$ satisfies Property (1).

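Let $\pi_D\colon M \to M/{\sim_D}$ and $\pi_K\colon M \to M/{\sim_K}$ be the natural morphisms.
By definition of $\mathbf{R}_{m+1}$ and $\mathbf{L}_{m+1}$, we have $M/{\sim_D} \in \mathbf{R}_m$ and $M/{\sim_K} \in \mathbf{L}_m$.
Let $\rho = \pi_D \circ \varphi$ and $\lambda = \pi_K \circ \varphi$, see Figure 4. The monoid $A^*/(\equiv_\rho \vee \equiv_\lambda)$
is a quotient of both $M/{\sim_D}$ and $M/{\sim_K}$; so $A^*/(\equiv_\rho \vee \equiv_\lambda) \in \mathbf{R}_m \cap \mathbf{L}_m$ and
there exists $n \ge 1$ such that

• $\triangleright_{m,n}$ is contained in $\equiv_\rho$ and $\triangleleft_{m,n}$ is contained in $\equiv_\lambda$ (by Proposition 4),

• $\equiv_{m-1,n}$ is contained in $\equiv_\rho \vee \equiv_\lambda$ (by induction).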
We show that $\equiv_{m,n+2|M|}$ is contained in $\equiv_\varphi$. Let $u \equiv_{m,n+2|M|} v$.
Consider the $\mathcal{R}$-factorization of $u$, i.e., $u = s_1 a_1 \cdots s_k a_k s_{k+1}$ with $a_i \in A$ and
$s_i \in A^*$ such that $1 = \varphi(s_1)$ and for all $1 \le i \le k$:

$\varphi(s_1 a_1 \cdots s_i) >_{\mathcal{R}} \varphi(s_1 a_1 \cdots s_i a_i) \mathrel{\mathcal{R}} \varphi(s_1 a_1 \cdots s_i a_i s_{i+1})$.

Since the number of $\mathcal{R}$-classes is at most $|M|$, we have $k < |M|$. Similarly,
let $v = t_1 b_1 \cdots t_{k'} b_{k'} t_{k'+1}$ with $b_i \in A$ and $t_i \in A^*$ be the $\mathcal{L}$-factorization of
$v$ such that $\varphi(t_{k'+1}) = 1$ and for all $1 \le i \le k'$:

$\varphi(t_i b_i t_{i+1} \cdots b_{k'} t_{k'+1}) \mathrel{\mathcal{L}} \varphi(b_i t_{i+1} \cdots b_{k'} t_{k'+1}) <_{\mathcal{L}} \varphi(t_{i+1} \cdots b_{k'} t_{k'+1})$.

As before, we have $k' < |M|$. By Lemma 13 (applied with $x = s_1 \cdots s_{i-1} a_{i-1}$,
$y = s_i$ and $z = a_i$), we have $a_i \notin \mathrm{alph}(s_i)$; and similarly, $b_i \notin \mathrm{alph}(t_{i+1})$.
Therefore, the positions of the $a_i$'s in $u$ are exactly the positions visited by
the ranker $r = X_{a_1} \cdots X_{a_k}$, and the positions of the $b_i$'s in $v$ are exactly the
positions visited by the ranker $s = Y_{b_{k'}} \cdots Y_{b_1}$. Since $u \equiv_{m,n+2|M|} v$, each of
the rankers $r$ and $s$ is defined on both $u$ and $v$, and all the positions visited
by the rankers $r$ and $s$ occur in the same order in $u$ as in $v$. We call these
positions special. Let

$u = u_1 c_1 \cdots u_\ell c_\ell u_{\ell+1}$
$v = v_1 c_1 \cdots v_\ell c_\ell v_{\ell+1}$

be obtained by factoring $u$ and $v$ at all the special positions. We have
$\ell \le k + k' < 2|M|$. We say that a special position is red if it is visited by
$r$, and that it is green if it is visited by $s$. Some special positions may be
both red and green, which means that more than one of the cases below may
apply.

For $u$ the above factorization is a refinement of the $\mathcal{R}$-factorization; and
for $v$ it is a refinement of the $\mathcal{L}$-factorization. In particular, $\varphi(u_1) = 1$,
$\varphi(v_{\ell+1}) = 1$ and

$\varphi(u_1 \cdots u_{i-1} c_{i-1}) \mathrel{\mathcal{R}} \varphi(u_1 \cdots u_{i-1} c_{i-1} u_i)$ for $1 < i \le \ell + 1$, (Eq($\mathcal{R}$))
$\varphi(v_i c_i v_{i+1} \cdots c_\ell) \mathrel{\mathcal{L}} \varphi(c_i v_{i+1} \cdots c_\ell)$ for $1 \le i \le \ell$. (Eq($\mathcal{L}$))

In order to prove $u \equiv_\varphi v$, we show that we can gradually substitute $u_i$ for $v_i$
in the product $v_1 c_1 \cdots v_\ell c_\ell v_{\ell+1} = v$, starting from $i = 1$, while maintaining
$\equiv_\varphi$-equivalence. Namely we show that, for each $i$, it holds

$u_1 \cdots u_{i-1} c_{i-1} \, u_i \, c_i v_{i+1} \cdots v_{\ell+1} \equiv_\varphi u_1 \cdots u_{i-1} c_{i-1} \, v_i \, c_i v_{i+1} \cdots v_{\ell+1}$. (Eq($i$))

Let $h_0$ be the leftmost red position: then $c_{h_0} = a_1$ and $s_1 = u_1 c_1 \cdots u_{h_0}$.
Since $\varphi(s_1) = 1$ and $M$ is aperiodic, the $\varphi$-image of every letter in $s_1$ is 1.
Applying Lemma 15 to the $a_1$-left factorizations of $u$ and $v$, we find that
$u_1 c_1 \cdots u_{h_0} \equiv_{m-1,n-1} v_1 c_1 \cdots v_{h_0}$ and in particular, these words have
the same alphabet. It follows that $\varphi(u_i) = \varphi(v_i) = 1$ for all $i \le h_0$, and
hence (Eq($i$)) holds for all $i \le h_0$.

The right-left dual of this reasoning establishes that $\varphi(u_i) = \varphi(v_i) = 1$
for all the $u_i$, $v_i$ to the right of the last (rightmost) green position, say $j_0$. In
particular, (Eq($i$)) also holds for all $i > j_0$.

We now assume that $h_0 < i \le j_0$ and we let $h - 1$ be the first red position
to the left of $i$ and $j$ be the first green position to the right of $i$: we have
$h_0 < h \le i \le j \le j_0$.

Case 1: $h = i$ ($i - 1$ is red). We have $u \triangleright_{m,n+2|M|} v$. By Lemma 14(1), a
sequence of at most $i - 1$ left-factorizations yields
$u_i c_i \cdots u_{\ell+1} \triangleright_{m,n+2|M|-i+1} v_i c_i \cdots v_{\ell+1}$. If $i$ is red, then by Lemma 14(1),
after one $c_i$-left-factorization, we see that $u_i \triangleright_{m,n+2|M|-i} v_i$. If $i$ is not red,
then $i$ is green and by Lemma 14(2), after at most $\ell - i$ right-factorizations, we
find that $u_i$ and $v_i$ are $\triangleright_{m,n+2|M|-i-(\ell-i)}$-equivalent. In any case, we have
$u_i \triangleright_{m,n} v_i$ and thus $u_i \equiv_\rho v_i$ (i.e., $\varphi(u_i) \sim_D \varphi(v_i)$) by the choice of $n$. In view
of (Eq($\mathcal{L}$)), Lemma 12 now implies

$u_i c_i v_{i+1} \cdots c_\ell v_{\ell+1} \equiv_\varphi v_i c_i v_{i+1} \cdots c_\ell v_{\ell+1}$

and left multiplication by $u_1 c_1 \cdots c_{i-1}$ yields (Eq($i$)).

Case 2: $j = i$ ($i$ is green). As in Case 1, we see that $u_i \equiv_\lambda v_i$. (Eq($\mathcal{R}$))
and Lemma 12 then imply

$u_1 c_1 \cdots u_{i-1} c_{i-1} u_i \equiv_\varphi u_1 c_1 \cdots u_{i-1} c_{i-1} v_i$,

and right multiplication by $c_i v_{i+1} \cdots v_{\ell+1}$ yields (Eq($i$)).

Case 3: $h < i < j$ ($i - 1$ is not red and $i$ is not green). By Lemma 15,
after at most $h - 1$ left factorizations and $\ell - j + 1$ right factorizations, we
obtain $u_h c_h \cdots u_j \equiv_{m,n+j-h} v_h c_h \cdots v_j$ (since
$n + j - h \le n + 2|M| - (h - 1) - (\ell - j + 1)$). Lemma 16 applied with $a = c_{i-1}$
and $b = c_i$, then yields $u_i \equiv_{m-1,n} v_i$. Since $\equiv_{m-1,n}$ is contained in
$\equiv_\lambda \vee \equiv_\rho$, there exist words $w_1, \ldots, w_d$ such that

$v_i = w_1 \equiv_\rho w_2 \equiv_\lambda w_3 \equiv_\rho \cdots \equiv_\lambda w_{d-2} \equiv_\rho w_{d-1} \equiv_\lambda w_d = u_i$.
After the discussion at the beginning of this section, we have $\mathrm{alph}(v_i) =
\mathrm{alph}(w_2) = \cdots = \mathrm{alph}(w_{d-1}) = \mathrm{alph}(u_i)$. Thus, by Lemma 13 we have
$\varphi(p u_i) \mathrel{\mathcal{R}} \varphi(p)$ if and only if $\varphi(p w_g) \mathrel{\mathcal{R}} \varphi(p)$, and $\varphi(u_i q) \mathrel{\mathcal{L}} \varphi(q)$ if and only
if $\varphi(w_g q) \mathrel{\mathcal{L}} \varphi(q)$ for all $p, q \in A^*$. As in Cases 1 and 2, we conclude that for
each $1 \le e < d$,

• if $w_e \equiv_\rho w_{e+1}$, then

$w_e c_i \cdots c_\ell v_{\ell+1} \equiv_\varphi w_{e+1} c_i \cdots c_\ell v_{\ell+1}$, and thus
$u_1 c_1 \cdots u_{i-1} c_{i-1} w_e c_i \cdots c_\ell v_{\ell+1} \equiv_\varphi u_1 c_1 \cdots u_{i-1} c_{i-1} w_{e+1} c_i \cdots c_\ell v_{\ell+1}$;

• and if $w_e \equiv_\lambda w_{e+1}$, then

$u_1 c_1 \cdots c_{i-1} w_e \equiv_\varphi u_1 c_1 \cdots c_{i-1} w_{e+1}$, and thus
$u_1 c_1 \cdots c_{i-1} w_e c_i v_{i+1} \cdots c_\ell v_{\ell+1} \equiv_\varphi u_1 c_1 \cdots c_{i-1} w_{e+1} c_i v_{i+1} \cdots c_\ell v_{\ell+1}$.

It follows by transitivity of $\equiv_\varphi$ that (Eq($i$)) holds.

Concluding the proof. We have now established (Eq($i$)) for every
$1 \le i \le \ell + 1$. It follows immediately, by transitivity, that $u \equiv_\varphi v$. □

4 Conclusion 

We have shown that for each $m \ge 1$, it is decidable whether a given regular
language is $\mathrm{FO}^2_m[<]$-definable. Previous results in the literature only showed
decidability for levels 1 and 2 of this quantifier alternation hierarchy. Our
decidability result follows from the proof that $\mathbf{FO}^2_m$ (the pseudovariety of
finite monoids corresponding to the $\mathrm{FO}^2_m[<]$-definable languages) is equal to
the intersection $\mathbf{R}_{m+1} \cap \mathbf{L}_{m+1}$, which was known to be decidable.

This result implies the decidability of the levels of the hierarchy given by
$\mathbf{V}_1 = \mathbf{J}$ and $\mathbf{V}_{m+1} = \mathbf{V}_m$ ** $\mathbf{J}$, since Straubing showed that $\mathbf{V}_m = \mathbf{FO}^2_m$ [27].
Straubing used general results of Almeida and Weil on two-sided semidirect
products to deduce from this that $\mathbf{FO}^2_2$ is decidable, but these results do not
extend to $\mathbf{FO}^2_m$ when $m > 2$ ([HE7J, see [27, Sec. 5] for a discussion).

We also showed that the decision procedure whether a regular language
$L$ is $\mathrm{FO}^2_m[<]$-definable is in Logspace on input the multiplication table of the
syntactic monoid of $L$, and in Pspace on input the minimal automaton of $L$.
The result behind this statement is the fact that membership in $\mathbf{R}_m$ and in
$\mathbf{L}_m$ is characterized by a small set of (rather complicated) identities.
Straubing conjectured a different and simpler set of identities (Conjecture 7 above).
Our results do not confirm this conjecture, which it would be interesting to
settle.